Skip to content

Comments

Slice Access to Regions of Memory + Pre-Decode Program #2

Closed
ec2 wants to merge 11 commits intoec2/bitlistfrom
ec2/alloc-preheap
Closed

Slice Access to Regions of Memory + Pre-Decode Program #2
ec2 wants to merge 11 commits intoec2/bitlistfrom
ec2/alloc-preheap

Conversation

@ec2
Copy link
Owner

@ec2 ec2 commented Mar 3, 2025

Memory accesses are expensive mainly due to the fact that we have to access go's map. Indexing into a slice is much better. This method however does mean that we need to alloc more memory upfront. But since Cannon is tailored towards op-program, this shouldnt be a problem because the memory access pattern is predictable because we know how go's runtime allocates memory.

Because we know where the program section lies, we can also decode all the instructions before the VM starts, which according to measurements, will decrease reads into memory by 70%+.

@Inphi
Copy link

Inphi commented Mar 12, 2025

@clabby could you please chime in on this approach to memory lookups for a program like Kona? Any surprises?

@clabby
Copy link

clabby commented Mar 12, 2025

@clabby could you please chime in on this approach to memory lookups for a program like Kona? Any surprises?

I didn't read into this too deeply, but I think this would cause issues if it's considering the entirety of the loadable ELF sections as code sections. Kona slab-allocates its heap at compile time, and never mmaps, so the device heap is allocated when allocating the initial program memory during loading the ELF.

See: https://github.com/op-rs/kona/blob/4aace7adf6a700a0431bfce0dd12e29fed87813b/crates/proof/std-fpvm/src/malloc.rs#L1

@clabby
Copy link

clabby commented Mar 13, 2025

Yeah we should be able to make this work with kona, just needs some modifications to the LoadElf function.

kona's program headers:

Elf file type is EXEC (Executable file)
Entry point 0x1f62b0
There are 6 program headers, starting at offset 64

Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  PHDR           0x000040 0x0000000000010040 0x0000000000010040 0x000150 0x000150 R   0x8
  LOAD           0x000000 0x0000000000010000 0x0000000000010000 0x0be2a4 0x0be2a4 R   0x1000
  LOAD           0x0be2a4 0x00000000000cf2a4 0x00000000000cf2a4 0x28ca24 0x28ca24 R E 0x1000
  LOAD           0x34acc8 0x000000000035ccc8 0x000000000035ccc8 0x000d60 0x5f5efd0 RW  0x1000
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW  0x0
  ATTRIBUTES     0x34babb 0x0000000000000000 0x0000000000000000 0x000042 0x000042 R   0x1

The final PT_LOAD section, w/ RW flags set, is our heap. Notice filesz is about 3.424 kB, and memsz is ~100MB. @ec2, you can use those flags in your LoadElf function to determine whether or not the segment is RO.

@pauldowman
Copy link

ethereum-optimism#14606 has been merged! I think this PR was just waiting for that, does it need anything else?

@BlocksOnAChain
Copy link

@clabby @Inphi @ec2 - any other open comment on this one or we are ok to merge it?

@pauldowman
Copy link

@ec2 would you be able to make this PR against the optimism repo?

@ec2
Copy link
Owner Author

ec2 commented May 15, 2025

Hey sorry @pauldowman I'm bad at github notification 😅 . I am no longer working at Chainsafe I'm at Risc0. I will transfer this branch over to our friends to get this over the finish line

@BlocksOnAChain
Copy link

@pauldowman - we now have this pr on the monorepo: ethereum-optimism#16030
I added you as one of the reviewers.

github-merge-queue bot pushed a commit to ethereum-optimism/optimism that referenced this pull request Aug 18, 2025
* Abstract the Merkle representation

* Implement asterisc's MPT

* migrate tests for mpt

* migrate tests for mpt

* copied benchmarks from asterisc

* fix failed merge

* Avoid pagelookup twice during setword invalidation

* use uints

* hashpool

* alloc pages upfront

* remove clutter

* fix

* gomallocregion

* more readable mapped regions

* fix state json codec test

* fix for singlethread too

* fix op-challenger test

* Remove MPT implementation

* address comments

* fix benchmark

* Use uint64 and also reuse hasher and buffers

* clean

* clean up some

* fix range

* Address reviewer feedback: clarify region bounds and add instruction cache fallback

* Revert "Address reviewer feedback: clarify region bounds and add instruction cache fallback"

This reverts commit 2a19b22.

* Clarify region bounds and add fallback to memory decode if PC exceeds instruction cache

* Update cannon/mipsevm/multithreaded/mips.go

Co-authored-by: Inphi <mlaw2501@gmail.com>

* fixed size of preheap cache for binaryTreeMemory

* slice out of bounds fixed in GetWord and AllocPage, pagetest syntax fix

* fix program heap initialization for binary tree

* moved cache decoded from instrumented state declaration to doMipsStep

* added default and set region map size to newMemory

* added 4096 byte memory region option to all testing cases using VMFactory

* clean up

* cache_decode to be initialize on first mipstep only for single step test compatibility

* added heap region sizing for unit/fuzz tests

* Revert " cache_decode to be initialize on first mipstep only for single step test compatibility"

This reverts commit 47a7166.

* cache allocation in instrumentedState initialization, catch for stale cache in doMipsStep

* remove unit test logic from domipstep and added helper for cache update in tests

* merge migration changes from develop

* some test step boundaries in instrumented_test needed to be slightly incremented, added decoded cache updateing to difftester_test.go

* small changes to step count and mismatch error vs string fault

* cannon: Use tiny i-cache by default

* uses a small i-cache by default to accomodate unit tests. This
  prevents tests from running out of memory in CI.
* fixes fuzz tests to preload the i-cache with the test instruction.

* op-e2e: Limit max parallelism for fp tests

The fp tests now require alot of memory to run. Particularly when cannon
is used. So limit parallelism to avoid OOM.

* increase parallelism

---------

Co-authored-by: Eric Tu <tu.eric@hotmail.com>
Co-authored-by: Inphi <mlaw2501@gmail.com>
janjakubnanista pushed a commit to ethereum-optimism/optimism that referenced this pull request Aug 19, 2025
* Abstract the Merkle representation

* Implement asterisc's MPT

* migrate tests for mpt

* migrate tests for mpt

* copied benchmarks from asterisc

* fix failed merge

* Avoid pagelookup twice during setword invalidation

* use uints

* hashpool

* alloc pages upfront

* remove clutter

* fix

* gomallocregion

* more readable mapped regions

* fix state json codec test

* fix for singlethread too

* fix op-challenger test

* Remove MPT implementation

* address comments

* fix benchmark

* Use uint64 and also reuse hasher and buffers

* clean

* clean up some

* fix range

* Address reviewer feedback: clarify region bounds and add instruction cache fallback

* Revert "Address reviewer feedback: clarify region bounds and add instruction cache fallback"

This reverts commit 2a19b22.

* Clarify region bounds and add fallback to memory decode if PC exceeds instruction cache

* Update cannon/mipsevm/multithreaded/mips.go

Co-authored-by: Inphi <mlaw2501@gmail.com>

* fixed size of preheap cache for binaryTreeMemory

* slice out of bounds fixed in GetWord and AllocPage, pagetest syntax fix

* fix program heap initialization for binary tree

* moved cache decoded from instrumented state declaration to doMipsStep

* added default and set region map size to newMemory

* added 4096 byte memory region option to all testing cases using VMFactory

* clean up

* cache_decode to be initialize on first mipstep only for single step test compatibility

* added heap region sizing for unit/fuzz tests

* Revert " cache_decode to be initialize on first mipstep only for single step test compatibility"

This reverts commit 47a7166.

* cache allocation in instrumentedState initialization, catch for stale cache in doMipsStep

* remove unit test logic from domipstep and added helper for cache update in tests

* merge migration changes from develop

* some test step boundaries in instrumented_test needed to be slightly incremented, added decoded cache updateing to difftester_test.go

* small changes to step count and mismatch error vs string fault

* cannon: Use tiny i-cache by default

* uses a small i-cache by default to accomodate unit tests. This
  prevents tests from running out of memory in CI.
* fixes fuzz tests to preload the i-cache with the test instruction.

* op-e2e: Limit max parallelism for fp tests

The fp tests now require alot of memory to run. Particularly when cannon
is used. So limit parallelism to avoid OOM.

* increase parallelism

---------

Co-authored-by: Eric Tu <tu.eric@hotmail.com>
Co-authored-by: Inphi <mlaw2501@gmail.com>
leopoldjoy pushed a commit to leopoldjoy/optimism that referenced this pull request Aug 22, 2025
…eum-optimism#16030)

* Abstract the Merkle representation

* Implement asterisc's MPT

* migrate tests for mpt

* migrate tests for mpt

* copied benchmarks from asterisc

* fix failed merge

* Avoid pagelookup twice during setword invalidation

* use uints

* hashpool

* alloc pages upfront

* remove clutter

* fix

* gomallocregion

* more readable mapped regions

* fix state json codec test

* fix for singlethread too

* fix op-challenger test

* Remove MPT implementation

* address comments

* fix benchmark

* Use uint64 and also reuse hasher and buffers

* clean

* clean up some

* fix range

* Address reviewer feedback: clarify region bounds and add instruction cache fallback

* Revert "Address reviewer feedback: clarify region bounds and add instruction cache fallback"

This reverts commit 2a19b22.

* Clarify region bounds and add fallback to memory decode if PC exceeds instruction cache

* Update cannon/mipsevm/multithreaded/mips.go

Co-authored-by: Inphi <mlaw2501@gmail.com>

* fixed size of preheap cache for binaryTreeMemory

* slice out of bounds fixed in GetWord and AllocPage, pagetest syntax fix

* fix program heap initialization for binary tree

* moved cache decoded from instrumented state declaration to doMipsStep

* added default and set region map size to newMemory

* added 4096 byte memory region option to all testing cases using VMFactory

* clean up

* cache_decode to be initialize on first mipstep only for single step test compatibility

* added heap region sizing for unit/fuzz tests

* Revert " cache_decode to be initialize on first mipstep only for single step test compatibility"

This reverts commit 47a7166.

* cache allocation in instrumentedState initialization, catch for stale cache in doMipsStep

* remove unit test logic from domipstep and added helper for cache update in tests

* merge migration changes from develop

* some test step boundaries in instrumented_test needed to be slightly incremented, added decoded cache updateing to difftester_test.go

* small changes to step count and mismatch error vs string fault

* cannon: Use tiny i-cache by default

* uses a small i-cache by default to accomodate unit tests. This
  prevents tests from running out of memory in CI.
* fixes fuzz tests to preload the i-cache with the test instruction.

* op-e2e: Limit max parallelism for fp tests

The fp tests now require alot of memory to run. Particularly when cannon
is used. So limit parallelism to avoid OOM.

* increase parallelism

---------

Co-authored-by: Eric Tu <tu.eric@hotmail.com>
Co-authored-by: Inphi <mlaw2501@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants